# Low VRAM Optimization

A ControlNet PEFT LoRA model based on HiDream-I1-Full, supporting text-to-image and image-to-image conversion

Image Generation

A LoRA fine-tuned version based on the Mochi-1 preview model, focusing on text-to-video generation tasks

GLM4 32B Neon V2

A roleplay fine-tuned version based on GLM-4-32B-0414, with excellent performance, distinctive personality, diverse styles, and elegant writing.

Large Language Model

Transformers English

The 4-bit AWQ quantized version of Orpheus-3b FT, optimized for text-to-speech tasks and supporting voice cloning functionality.

Speech Synthesis English

YaTharThShaRma999

Deepseek V3 0324 GGUF UD

A dynamically quantized version of DeepSeek-V3-0324 provided by Unsloth, supporting inference frameworks such as llama.cpp and LMStudio.

Large Language Model English

Deepseek V3 0324 GGUF

This quantized V3-0324 build is described as the best-performing quantization in its size category, significantly reducing file size while keeping performance close to Q8_0.

Large Language Model Other

GGUF-format quantized version of Stable Diffusion XL, offering different quantization levels to accommodate various hardware configurations.

HyperX-Sentience

This is a LoRA trained for the Wan2.1 14B video generation model, suitable for text-to-video and image-to-video tasks.

Video Processing Supports Multiple Languages

Cat Text To Video 2.3b

A text-to-video model based on conditional enhancement that extends generated segments and achieves smooth transitions through temporal condition transformers. It also supports prompt interpolation.

Text-to-Video English

Minicpm O 2 6 Int4

The int4 quantized version of MiniCPM-o 2.6, significantly reducing GPU VRAM usage while supporting multimodal processing capabilities.

Transformers Other

FLUX.1-dev is a text-to-image generation model from Black Forest Labs, built on a rectified flow transformer rather than on Stable Diffusion. It supports LoRA fine-tuning and is suited to creative image generation tasks.

Image Generation

The Illustrious model is a text-to-image AI model capable of generating high-quality images from text descriptions.

Text-to-Image English

Controlnet Kohaku Canny Sdxl Fp16

A ControlNet model based on Stable Diffusion XL, specializing in precise image generation control through Canny edge detection

Image Generation

Hunyuanvideo Gguf

GGUF quantized version of Tencent's HunyuanVideo model, designed specifically for text-to-video generation in ComfyUI.

FLUX.1 Fill Dev GGUF

FLUX.1-Fill-dev is a FLUX-based image generation model specializing in image inpainting tasks.

Text-to-Image English

Aria Sequential Mlp Bnb Nf4

A BitsAndBytes NF4 quantized version of Aria-sequential_mlp, suitable for image-to-text tasks and requiring approximately 15.5 GB of VRAM.

Flux.1 Lite 8B Alpha

Flux.1 Lite is an 8B-parameter Transformer model distilled from FLUX.1-dev. It keeps the same precision (bfloat16) while using 7 GB less memory and running 23% faster.

A video generation model based on CogVideoX-5b, capable of producing high-quality video content from text descriptions

Text-to-Video English

CogVideoX is the open-source version of the video generation model from Qingying. The 2B version is an entry-level model that balances compatibility with low operational and development costs.

Text-to-Video English

Chromafur Alpha Gguf

ChromaFur Alpha is a text-to-image generation model converted to GGUF format, suitable for low-end GPUs or users who prefer fast loading.

Image Generation

CogVideoX is an open-source video generation model originating from Qingying. The 2B version is an entry-level model, balancing compatibility with low operational and development costs.

Text-to-Video English

Herobophades 3x7B

HeroBophades-3x7B is an experimental Mixture of Experts (MoE) language model built using mergekit, designed to run in 4-bit mode on GPUs with 12 GB of VRAM.

Large Language Model

Erosumika 7B V3 7.1bpw Exl2

Erosumika-7B-v3 is a 7.1bpw exl2 quantized language model suitable for running 16k context on GPUs with 8 GB of VRAM. It was created by merging multiple models using the DARE TIES method, primarily for entertainment-oriented fictional writing.

Large Language Model

Transformers English
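
To see why a 7.1bpw quantization of a 7B model can fit on an 8 GB GPU, the weight footprint can be estimated from the parameter count and the bits per weight. The sketch below is only a rough back-of-the-envelope calculation: the nominal 7e9 parameter count and the helper name `weight_memory_gb` are assumptions for illustration, and the real headroom also depends on the KV cache for the 16k context, activations, and framework overhead.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: a nominal 7e9 parameters for a "7B" model; the exact count may differ.
weights_gb = weight_memory_gb(7e9, 7.1)

# What is left of an 8 GB card for the KV cache, activations, and overhead.
headroom_gb = 8.0 - weights_gb

print(f"weights ~ {weights_gb:.2f} GB, headroom ~ {headroom_gb:.2f} GB")
```

Under these assumptions the weights take roughly 6.2 GB, leaving under 2 GB for everything else, so lower bits-per-weight quantizations are the usual choice when longer contexts or smaller GPUs are involved.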

A Stable Diffusion model specifically designed for generating anime/manga storyboards

Image Generation Other

parsee-mizuhashi

Animatediff Motion Adapter V1 5 3

AnimateDiff is a technique that turns existing Stable Diffusion text-to-image models into video generators by inserting motion module layers, which produce coherent motion between image frames.

Video Processing

Show-1 is an efficient text-to-video generation model that combines the advantages of pixel and latent space diffusion models, capable of producing high-quality videos with precise text alignment.

Video Processing

Show-1 is an efficient text-to-video generation model that combines the strengths of pixel and latent space diffusion models to produce high-quality videos closely aligned with text prompts.

Video Processing

Sygil Diffusion

A fine-tuned version of Stable Diffusion that supports multi-level namespace control over image generation elements, helping to avoid context confusion.

Image Generation Supports Multiple Languages

Colorjizz 512px

A 512px color style model based on Stable Diffusion 1.5 and trained on 130 images. The prompt 'colorjizz' activates its vibrant color effects.

Image Generation


© 2025 AIbase